Results 1 - 20 of 57
1.
Braz. J. Psychiatry (São Paulo, 1999, Impr.) ; 45(6): 482-490, Nov.-Dec. 2023. tab, graf
Article in English | LILACS-Express | LILACS | ID: biblio-1533996

ABSTRACT

Objective: To develop a classification framework based on random forest (RF) modeling to outline the declarative memory profile of patients with panic disorder (PD) compared to a healthy control sample. Methods: We developed RF models to classify the declarative memory profile of PD patients in comparison to a healthy control sample using the Rey Auditory Verbal Learning Test (RAVLT). For this study, a total of 299 patients with PD living in the city of Rio de Janeiro (70.9% females, age 39.9 ± 7.3 years old) were recruited through clinician referrals or self/family referrals. Results: Our RF models successfully predicted declarative memory profiles in patients with PD based on RAVLT scores (lowest area under the curve [AUC] of 0.979, for classification; highest root mean squared percentage error [RMSPE] of 17.2%, for regression) using relatively bias-free clinical data, such as sex, age, and body mass index (BMI). Conclusions: Our findings also suggested that BMI, used as a proxy for diet and exercise habits, plays an important role in declarative memory. Our framework can be extended and used as a prospective tool to classify and examine associations between clinical features and declarative memory in PD patients.
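
The regression metric reported in this abstract can be illustrated with a short sketch. It assumes the usual definition of RMSPE (square root of the mean squared relative error, expressed in percent); the abstract does not state the exact formula used, and the scores below are hypothetical illustrative values, not study data.

```python
import math

def rmspe(y_true, y_pred):
    """Root mean squared percentage error, in percent (assumed definition)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Relative error of each prediction; observed values must be non-zero.
    rel_sq = [((t - p) / t) ** 2 for t, p in zip(y_true, y_pred)]
    return 100.0 * math.sqrt(sum(rel_sq) / len(rel_sq))

# Hypothetical test scores, for illustration only.
observed = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 40.0]
print(round(rmspe(observed, predicted), 2))
```

Because each error is divided by the observed value, the metric is unit-free, which is presumably why it suits comparisons across RAVLT sub-scores with different ranges.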

2.
Chinese Journal of Radiation Oncology ; (6): 138-144, 2023.
Article in Chinese | WPRIM | ID: wpr-993164

ABSTRACT

Objective: To evaluate the feasibility of predicting lung cancer target position by online optical surface motion monitoring. Methods: CT images obtained in different ways of stereotactic body radiotherapy (SBRT) plans from 16 lung cancer cases were selected for experimental simulation. The planned CT and the original target position were taken as the reference, and the 10 phases of the four-dimensional CT and each cone beam CT (CBCT) were taken as the floating objects, on which the floating target location was delineated. The binocular visual surface imaging method was used to obtain point cloud data of the reference and floating image body surfaces, and the point cloud feature information was extracted for comparison. Based on the random forest algorithm, the feature information difference and the corresponding target area position difference were fitted, and an online prediction model of the target area position was constructed. Results: The model had a high prediction success rate for the target position. The variance explained and root mean squared error (RMSE) in the left-right, superior-inferior, and anterior-posterior directions were 99.76%, 99.25%, 99.58%, and 0.0447 mm, 0.0837 mm, 0.0616 mm, respectively. Conclusion: The online monitoring of lung SBRT target position proposed in this study is feasible, which can provide reference for online monitoring and verification of target position and dose evaluation in clinical radiotherapy.
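
The two figures reported per direction can be computed as in the minimal sketch below. It assumes "variance explained" means 1 - SS_res/SS_tot expressed in percent, which is the convention of common random forest implementations but is not confirmed by the abstract; the displacement values are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the inputs (here mm)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def variance_explained(y_true, y_pred):
    """Percentage of variance explained: 100 * (1 - SS_res / SS_tot)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 100.0 * (1.0 - ss_res / ss_tot)

# Hypothetical target displacements in one direction (mm), for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.1, 3.9]
print(rmse(measured, predicted), variance_explained(measured, predicted))
```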

3.
Acta Pharmaceutica Sinica ; (12): 1713-1721, 2023.
Article in Chinese | WPRIM | ID: wpr-978730

ABSTRACT

Fusarium oxysporum widely exists in farmland soil and is one of the main pathogenic fungi of root rot, which seriously affects the growth and development of plants and often causes serious losses of cash crops. In order to screen out natural compounds that inhibit the activity of Fusarium oxysporum more economically and efficiently, random forest, support vector machine and artificial neural network models based on machine learning algorithms were constructed using the information of known inhibitory compounds in the ChEMBL database in this study. The antibacterial activity of the screened drugs was verified thereafter. The results showed that the prediction accuracy of the three models reached 77.58%, 83.03% and 81.21%, respectively. Based on the inhibition experiment, the best inhibition effect (MIC = 0.3125 mg·mL−1) of ononin was verified. The virtual screening method proposed in this study provides ideas for the development and creation of new pesticides derived from natural products, and the screened ononin is expected to be a potential lead compound for the development of novel inhibitors of Fusarium oxysporum.

4.
Journal of Environmental and Occupational Medicine ; (12): 565-570, 2023.
Article in Chinese | WPRIM | ID: wpr-973648

ABSTRACT

Background Phenolic compounds may adversely affect human health, but the current relevant studies are mostly limited to the impact of single phenolic compound exposure on human health, and there is still a lack of studies on the population-based association between combined exposure to multiple common phenolic compounds and dyslipidemia. Objective To explore the association of phenolic compound combined exposure and dyslipidemia based on principal component analysis-random forest (PCA-RF) strategy. Methods The data were from the National Health and Nutrition Examination Survey (2013–2016). A total of 1301 adult residents aged ≥ 20 years with complete information on demographics and lifestyle, urine phenol concentrations (bisphenol A, bisphenol F, bisphenol S, triclocarban, benzophenone, and triclosan), and serum concentrations of total cholesterol (TC), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), and low-density lipoprotein cholesterol (LDL-C) were included in this study. The concentrations of six urinary phenolic compounds were determined by solid phase extraction coupled with high performance liquid chromatography and tandem mass spectrometry, and the lipid indicators were determined by enzymatic methods. Principal component analysis combined with random forest model was used for model construction. First, principal component analysis was performed on 18 original variables including 6 phenolic compounds and 12 basic characteristic indicators, and then random forest model was established with dyslipidemia and its four evaluation indicators as dependent variables and the extracted principal components as independent variables, respectively. 
Results The PCA-RF analysis showed that bisphenol A, bisphenol F, and benzophenone may be important factors for dyslipidemia in the study subjects; bisphenol A, bisphenol F, and triclosan may be important factors for TC level in the study subjects; bisphenol A, bisphenol F, triclocarban, and benzophenone may be important factors for TG level in the study subjects; bisphenol A may be an important factor for LDL-C level in the study subjects; bisphenol F and benzophenone may be important factors for HDL-C level in the study subjects. Conclusion Phenolic compound exposure may be an important risk factor for the development of dyslipidemia. PCA-RF strategy can be effectively used to explore the association between phenolic compound exposure and dyslipidemia in the population.

5.
Journal of Central South University(Medical Sciences) ; (12): 213-220, 2023.
Article in English | WPRIM | ID: wpr-971388

ABSTRACT

OBJECTIVES: Abdominal aortic aneurysm is a pathological condition in which the abdominal aorta is dilated beyond 3.0 cm. The surgical options include open surgical repair (OSR) and endovascular aneurysm repair (EVAR). Prediction of acute kidney injury (AKI) after OSR is helpful for decision-making during the postoperative phase. To find a more efficient method for making a prediction, this study aims to perform tests on the efficacy of different machine learning models. METHODS: Perioperative data of 80 OSR patients were retrospectively collected from January 2009 to December 2021 at Xiangya Hospital, Central South University. The vascular surgeon performed the surgical operation. Four commonly used machine learning classification models (logistic regression, linear kernel support vector machine, Gaussian kernel support vector machine, and random forest) were chosen to predict AKI. The efficacy of the models was validated by five-fold cross-validation. RESULTS: AKI was identified in 33 patients. Five-fold cross-validation showed that among the 4 classification models, random forest was the most precise model for predicting AKI, with an area under the curve of 0.90±0.12. CONCLUSIONS: Machine learning models can precisely predict AKI during early stages after surgery, which allows vascular surgeons to address complications earlier and may help improve the clinical outcomes of OSR.
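
A figure such as "0.90±0.12" is typically a summary over the cross-validation folds. The sketch below shows one way to obtain it, assuming the ± denotes the standard deviation across folds (the abstract does not say so explicitly). The AUC is computed with the rank-based (Mann-Whitney) formulation; labels, scores, and fold values are hypothetical.

```python
from statistics import mean, stdev

def auc(labels, scores):
    """Rank-based AUC: probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def summarize_folds(fold_aucs):
    """Mean and sample standard deviation of per-fold AUC values."""
    return mean(fold_aucs), stdev(fold_aucs)

# Hypothetical predictions for one fold, then hypothetical per-fold AUCs.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
print(summarize_folds([0.8, 0.9, 1.0]))
```

With only 80 patients, each fold holds about 16 cases, so a wide spread between folds is to be expected.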


Subject(s)
Humans , Aortic Aneurysm, Abdominal/complications , Endovascular Procedures/methods , Retrospective Studies , Blood Vessel Prosthesis Implantation/adverse effects , Acute Kidney Injury/etiology , Machine Learning , Treatment Outcome , Postoperative Complications/etiology , Risk Factors
6.
Chinese Journal of Practical Nursing ; (36): 1829-1835, 2023.
Article in Chinese | WPRIM | ID: wpr-990414

ABSTRACT

Objective: To construct a hypoglycemia random forest prediction model for older adults with type 2 diabetes, and assess the model's predictive performance through internal and external verification. Methods: From August 2022 to January 2023, 300 older adults with type 2 diabetes in Beijing Hospital were selected. The demographic characteristics, medical history, laboratory tests, and other data of the patients were collected, and the data set was randomly divided into the training set and verification set in a ratio of 7:3. The hypoglycemia prediction model for older adults with type 2 diabetes was constructed and optimized based on the random forest algorithm. The calibration curve was used to evaluate the model's calibration, and the ROC was used to evaluate the model's discrimination. The clinical applicability of the model was assessed by the decision curve analysis. The risk factors for hypoglycemia in the older adults were explored by prioritizing the contributions of variables in prediction. The Bootstrap method was used for internal validation, and the validation set was used for external validation. Results: Among the 300 older adults with type 2 diabetes, 128 cases (42.67%) experienced hypoglycemia within one week. The predictive contributions of risk factors in the model were ranked as follows: the number of episodes of hypoglycemia in one month, HDL-C, heart disease, diabetes knowledge and education, combination therapy, age, duration of diabetes, staple food restriction, glycosylated hemoglobin, and gender. The internal and external calibration curves of the hypoglycemia random forest model for the older adults with type 2 diabetes fluctuated around the diagonal, indicating that the calibration degree of the predictive model was good. The AUROC of internal verification was 0.823 (95% CI 0.752-0.894), and the sensitivity and specificity were 0.867 and 0.698, respectively. 
The AUROC of external verification was 0.859 (95% CI 0.817-0.902), and the sensitivity and specificity were 0.789 and 0.804, respectively, showing that the overall discrimination of the prediction model was good. The DCA curves were far from the all-positive line and all-negative line, which indicated that the prediction model had good clinical applicability. Conclusions: The model predicts well, is suitable for predicting the risk of hypoglycemia in older adults with type 2 diabetes, and provides a reference for early hypoglycemia screening and predictive intervention in these patients.
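
The sensitivity and specificity quoted for this model follow from the confusion matrix at a chosen decision threshold. A minimal sketch, assuming binary labels where 1 marks a hypoglycemia event; the label vectors are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = event."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical outcomes and predictions, for illustration only.
actual = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 0, 1]
print(sensitivity_specificity(actual, predicted))
```

Both values depend on the threshold applied to the model's predicted probability, so they should be read together with the AUROC rather than in isolation.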

7.
Digital Chinese Medicine ; (4): 151-159, 2023.
Article in English | WPRIM | ID: wpr-987635

ABSTRACT

【Objective】 To explore the influencing factors of Yang deficiency constitution in traditional Chinese medicine (TCM) from the perspective of mathematics with the use of calculation formulas, so as to protect patients from getting diseases caused by Yang deficiency constitution and provide suggestions for TCM disease prevention. 【Methods】 Based on the classification and determination criteria of TCM constitution implemented by the China Association of Chinese Medicine, data for 24 solar terms from May 5, 2020 (Start of Summer) to April 20, 2021 (Grain Rain) for the identification of Yang deficiency were collected by a mobile constitution identification system. The grey correlation analysis method was used to determine the grey correlation degree of the factors influencing Yang deficiency constitution. In addition, a random forest model was constructed for the verification of the results from the grey correlation analysis, and for the evaluation of the correlation degree between Yang deficiency constitution and its influencing factors. 【Results】 A total of 16 259 sets of data were collected from healthy or sub-healthy individuals aged from 18 to 60 years living in the central and northeastern parts of Sichuan Province (China) for the identification of TCM constitutions. After screening and preprocessing, a total of 544 sets of data for the identification of Yang deficiency constitution were retained, involving 18 factors influencing Yang deficiency constitution. The results of the grey correlation analysis showed that there were 12 influencing factors whose grey correlation degree with Yang deficiency constitution was greater than 0.6. The accuracies of the Yang deficiency constitution random forest model built on these 12 influencing factors were 98.39% and 93.12% on the training set and validation set, respectively. 
【Conclusion】  In the sample data selected in this paper, grey correlation analysis is the appropriate technology to analyze the influencing factors of Yang deficiency constitution. It provides a new idea and a new methodological reference for the research and analysis of the influencing factors of TCM constitution.
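
The grey correlation degree used for the 0.6 cut-off can be sketched as follows. This assumes Deng's standard grey relational formula with distinguishing coefficient rho = 0.5 and inputs already normalized to a common scale; the abstract does not state which variant or normalization was used, and the sequences are hypothetical.

```python
def grey_relational_degrees(reference, factors, rho=0.5):
    """Grey relational degree of each factor sequence to the reference sequence."""
    # Absolute differences between the reference and every factor, point by point.
    deltas = [[abs(r - x) for r, x in zip(reference, seq)] for seq in factors]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        return [1.0 for _ in factors]  # every factor coincides with the reference
    # Relational coefficient per point, averaged into one degree per factor.
    return [
        sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for row in deltas
    ]

# Hypothetical normalized sequences, for illustration only.
yang_deficiency = [1.0, 0.5, 0.0]
candidate_factors = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]
degrees = grey_relational_degrees(yang_deficiency, candidate_factors)
print([round(d, 3) for d in degrees], [d > 0.6 for d in degrees])
```

In this toy case the first factor tracks the reference exactly and passes the 0.6 threshold, while the second runs opposite to it and does not.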

8.
Journal of Environmental and Occupational Medicine ; (12): 349-354, 2023.
Article in Chinese | WPRIM | ID: wpr-969641

ABSTRACT

Background Aedes albopictus is the dominant mosquito species in residential areas in Shanghai. There are many types of small containers with accumulated water in residential areas, providing a large number of breeding environments for Aedes albopictus and leading to an increasing transmission risk of mosquito-borne diseases. Objective To use random forest to predict breeding of Aedes mosquitoes in small aquatic container habitat in two concentrated reconstruction communities of rural areas in Shanghai, and to understand the associated influence of environmental factors on the breeding of Aedes mosquitoes in the process of urbanization. Methods Small-scale habitat surveys of Aedes mosquitoes were carried out in two suburban concentrated reconstruction communities (Community A and B) in Shanghai, and the environment where the habitat was located was recorded and analyzed in both communities. The habitat where eggs, larvae, or pupae were found was recorded as positive. A spatial weight matrix was applied on a household basis, and the global Moran's I index was used to carry out spatial autocorrelation analysis on the small-scale habitat and positive habitat in the environment of the two communities. When Moran's I is greater than 0, it means that the data present a positive spatial correlation; when Moran's I is less than 0, it means that the data are spatially negatively correlated; when Moran's I is 0, the spatial distribution is random. Combining the results of P and Z values, we explored the spatial distribution characteristics of small-scale habitat and positive habitat in the community environment. The random forest algorithm in machine learning was used to classify and sort environmental-related factors, and predict the breeding of Aedes mosquitoes in small aquatic habitat; the receiver operating characteristic (ROC) curve was used to carry out model fitting evaluation. 
Results The environmental factors including building location (χ2=23.35, P<0.001), open space (χ2=8.83, P=0.003), and having trees (χ2=11.02, P=0.001) had a significant impact on the positive rate of small-scale habitat. The results of spatial characteristics analysis showed that the global Moran's I index of small-scale habitat was −0.092 (Z=−1.09, P=0.274) in Community A and 0.034 (Z=0.52, P=0.602) in Community B, and the global Moran's I index of positive habitat was −0.092 (Z=−1.14, P=0.255) in Community A and 0.070 (Z=0.95, P=0.342) in Community B. Since the P values of Community A and B were greater than 0.1 and the Z values were between −1.65 and 1.65, for both small-scale habitat and positive habitat the spatial characteristics were randomly distributed and no significant spatial aggregation was found. In the fitted random forest algorithm classification prediction model with the top 10 characteristic factors of importance, the area under curve (AUC) value was 0.95, and the prediction fitting effect was satisfactory. The results of classification and sorting indicated that counts of household small-scale habitat and positive habitat were the most important factors for breeding. Conclusion The random forest model constructed by environmental factor indicators can be used to predict the breeding situation of Aedes mosquitoes in small-scale aquatic habitat, and provide a basis for scientific prevention and control of mosquito breeding for the target area.
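
The global Moran's I statistic described in the Methods can be written out directly. The sketch below uses the standard formula with a household-by-household weight matrix; the four-household adjacency matrix and habitat counts are hypothetical, and the Z and P values reported in the abstract would require an additional permutation or normal-approximation step not shown here.

```python
def morans_i(values, weights):
    """Global Moran's I for observations `values` and spatial weights `weights[i][j]`."""
    n = len(values)
    mean_v = sum(values) / n
    dev = [v - mean_v for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den

# Hypothetical: four households in a row, each adjacent to its neighbours.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i([1, 2, 3, 4], W))  # similar neighbours: positive
print(morans_i([1, 0, 1, 0], W))  # alternating neighbours: negative
```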

9.
Chinese Journal of Medical Instrumentation ; (6): 396-401, 2023.
Article in Chinese | WPRIM | ID: wpr-982252

ABSTRACT

Ventricular fibrillation is the most common pathophysiological mechanism leading to cardiac arrest. If cardiac arrest can be rescued in time, the survival rate of patients can be greatly improved. Therefore, rapid and accurate identification of ventricular fibrillation is extremely important. This paper proposes an automatic detection algorithm for ventricular fibrillation based on random forest and BP (back propagation) neural network. The ECG signal is passed through a 6 s moving window, and 6 kinds of characteristic parameters are calculated from the time-frequency domain information of the signal. These 6 kinds of characteristic parameters are used as the input of the classifier for classification and testing against the annotations given by the authoritative experts in the database. A total of 44 cases of related data were used to evaluate the method. The results show that, using the ten-fold cross-validation method, the accuracy of classification of ventricular fibrillation in the CU database (Creighton University Ventricular Tachyarrhythmia Database) and the AHA database (the American Heart Association Database) reached 96.38% and 99.45%, respectively, which indicates that the method has certain applicability.
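
The windowing step can be sketched as below. Only the 6 s window length comes from the abstract; the 250 Hz sampling rate and the 1 s step between windows are assumptions made for illustration, and the six features themselves are not specified in the abstract, so none are computed here.

```python
def sliding_windows(signal, fs, window_s=6.0, step_s=1.0):
    """Split `signal` (sampled at `fs` Hz) into overlapping fixed-length windows."""
    win = int(round(window_s * fs))
    step = int(round(step_s * fs))
    if win <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # Only full-length windows are kept; a trailing partial window is dropped.
    return [signal[start:start + win] for start in range(0, len(signal) - win + 1, step)]

# Hypothetical 10 s recording at an assumed 250 Hz sampling rate.
fs = 250
ecg = [0.0] * (10 * fs)
windows = sliding_windows(ecg, fs)
print(len(windows), len(windows[0]))
```

Each window would then be reduced to a feature vector and passed to the classifier.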

10.
Journal of Biomedical Engineering ; (6): 280-285, 2023.
Article in Chinese | WPRIM | ID: wpr-981540

ABSTRACT

Automatic sleep staging based on deep learning requires a large amount of data, and its computational complexity is high. In this paper, an automatic sleep staging method based on power spectral density (PSD) and random forest is proposed. Firstly, the PSDs of six characteristic waves (K complex wave, δ wave, θ wave, α wave, spindle wave, β wave) in electroencephalogram (EEG) signals were extracted as the classification features, and then five sleep states (W, N1, N2, N3, REM) were automatically classified by a random forest classifier. The whole-night sleep EEG data of healthy subjects in the Sleep-EDF database were used as experimental data. The effects of using different EEG signals (Fpz-Cz single channel, Pz-Oz single channel, Fpz-Cz + Pz-Oz dual channel), different classifiers (random forest, adaptive boost, gradient boost, Gaussian naïve Bayes, decision tree, K-nearest neighbor), and different training and test set divisions (2-fold cross-validation, 5-fold cross-validation, 10-fold cross-validation, single subject) on the classification effect were compared. The experimental results showed that the effect was the best when the input was the Pz-Oz single-channel EEG signal and the random forest classifier was used; no matter how the training set and test set were divided, the classification accuracy was above 90.79%. The overall classification accuracy, macro average F1 value, and Kappa coefficient reached at most 91.94%, 73.2% and 0.845, respectively, which showed that this method was effective, insensitive to data volume, and stable. Compared with the existing research, our method is more accurate and simpler, and is suitable for automation.
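
The macro average F1 and Kappa coefficient reported here can be computed as in this sketch, which assumes the standard definitions (unweighted mean of per-class F1, and Cohen's kappa); the stage labels below are hypothetical.

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels, corrected for chance agreement."""
    n = len(y_true)
    p_obs = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_exp = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Hypothetical sleep-stage labels, for illustration only.
scored = ["W", "W", "N2", "N2", "REM", "REM"]
predicted = ["W", "W", "N2", "REM", "REM", "REM"]
print(cohen_kappa(scored, predicted), macro_f1(scored, predicted))
```

Because macro F1 weights every stage equally, a rarely occurring and poorly classified stage can pull it well below the overall accuracy, which may explain the gap between 91.94% and 73.2%.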


Subject(s)
Humans , Random Forest , Bayes Theorem , Sleep Stages , Sleep , Electroencephalography/methods
11.
Biomedical and Environmental Sciences ; (12): 406-417, 2023.
Article in English | WPRIM | ID: wpr-981069

ABSTRACT

OBJECTIVE: To explore the genotyping characteristics of human fecal Escherichia coli (E. coli) and the relationships between antibiotic resistance genes (ARGs) and multidrug resistance (MDR) of E. coli in Miyun District, Beijing, an area with high incidence of infectious diarrheal cases but no related data. METHODS: Over a period of 3 years, 94 E. coli strains were isolated from fecal samples collected from Miyun District Hospital, a surveillance hospital of the National Pathogen Identification Network. The antibiotic susceptibility of the isolates was determined by the broth microdilution method. ARGs, multilocus sequence typing (MLST), and polymorphism trees were analyzed using whole-genome sequencing data (WGS). RESULTS: This study revealed that 68.09% of the isolates had MDR, prevalent and distributed in different clades, with a relatively high rate and low pathogenicity. There was no difference in MDR between the diarrheal (49/70) and healthy groups (15/24). CONCLUSION: We developed a random forest (RF) prediction model of TEM.1 + baeR + mphA + mphB + QnrS1 + AAC.3-IId to identify MDR status, highlighting its potential for early resistance identification. The causes of MDR are likely mobile units transmitting the ARGs. In the future, we will continue to strengthen the monitoring of ARGs and MDR, and increase the number of strains to further verify the accuracy of the MDR markers.


Subject(s)
Humans , Escherichia coli/genetics , Escherichia coli Infections/epidemiology , Multilocus Sequence Typing , Genotype , Beijing , Drug Resistance, Multiple, Bacterial/genetics , Anti-Bacterial Agents/pharmacology , Diarrhea , Microbial Sensitivity Tests
12.
Journal of Environmental and Occupational Medicine ; (12): 1232-1239, 2023.
Article in Chinese | WPRIM | ID: wpr-998746

ABSTRACT

Background Public places are frequently polluted by cigarette smoking, and there is a lack of accurate, real-time, and intelligent monitoring technology to identify smoking behavior. It is necessary to develop a tool to identify cigarette smoking behavior in public places for more efficient control of cigarette smoking and better indoor air quality. Objective To construct a model for recognizing cigarette smoking behavior based on real-time indoor concentrations of PM2.5 in public places. Methods Real-time indoor PM2.5 concentrations were measured for at least 7 continuous days in 10 arbitrarily selected places (6 public service providers and 4 office or other places) from Oct. to Nov. 2022 in Pudong New Area, Shanghai. Indoor nicotine concentrations were monitored with passive samplers simultaneously. Outdoor PM2.5 concentration data were obtained from three municipal environmental monitoring stations which were nearest to each monitoring point during the same period. Mann-Whitney U test was used to compare indoor and outdoor means of PM2.5 concentrations, and Spearman rank correlation was used to analyze indoor PM2.5 and nicotine concentrations. An interactive plot and a random forest model were applied to examine the association between video observation validated indoor smoking behavior and real-time indoor PM2.5 concentrations in an Internet cafe. Results The average indoor PM2.5 concentration in the places providing public services [(97.5±149.3) µg·m−3] was significantly higher than that in office and other places [(19.8±12.2) µg·m−3] (P=0.011). The indoor/outdoor ratio (I/O ratio) of PM2.5 concentration in the public service providers ranged from 1.1 to 19.0. Furthermore, the indoor PM2.5 concentrations in the 10 public places were significantly correlated with the nicotine concentrations (rs=0.969, P<0.001). Among them, the top 3 highly polluted places were Internet cafes, chess and card rooms, and KTV. 
The results of random forest modeling showed that, for synchronous real-time PM2.5 concentration, the area under the curve (AUC) was 0.66, while for PM2.5 concentration at a lag of 4 min after the incidence of smoking behavior, the AUC increased to 0.72. Conclusion The indoor PM2.5 concentrations in public places are highly correlated with smoking behavior. Based on real-time indoor PM2.5 monitoring, a preliminary recognition model for smoking behavior is constructed with acceptable accuracy, indicating its potential value for smoking control and management in public places.
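
The 4 min lag comparison amounts to pairing each smoking observation with the PM2.5 reading taken a fixed number of minutes later. A minimal sketch, assuming both series are sampled once per minute and aligned in time (the abstract does not give the sampling interval); the values are hypothetical.

```python
def lagged_pairs(smoking, pm25, lag):
    """Pair the smoking indicator at minute t with the PM2.5 reading at minute t + lag."""
    if len(smoking) != len(pm25):
        raise ValueError("series must have equal length")
    if lag < 0 or lag >= len(pm25):
        raise ValueError("lag must be between 0 and len(series) - 1")
    # The last `lag` smoking observations have no later reading and are dropped.
    return list(zip(smoking[:len(smoking) - lag], pm25[lag:]))

# Hypothetical per-minute series: smoking observed (1) or not (0), and PM2.5 readings.
smoking = [0, 1, 0, 0, 0, 0]
pm25 = [10, 12, 15, 30, 60, 80]
print(lagged_pairs(smoking, pm25, lag=4))
```

A classifier trained on the lagged pairs can then be compared with one trained on the synchronous pairs (lag=0), which is presumably how the two AUC values were obtained.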

13.
Cancer Research and Clinic ; (6): 596-604, 2023.
Article in Chinese | WPRIM | ID: wpr-996281

ABSTRACT

Objective:To investigate the factors influencing the prognosis of anaplastic thyroid cancer (ATC) and to evaluate the application value of established random survival forest (RSF) model in the prognosis prediction of ATC.Methods:A total of 707 ATC patients diagnosed by histopathology in the Surveillance, Epidemiology and End Results (SEER) database of the National Cancer Institute from 2004 to 2015 were selected and randomly divided into the training set (495 cases) and the validation set (212 cases). Univariate Cox regression risk model was used to analyze the related factors affecting overall survival (OS) of patients in the training set. The multivariate Cox proportional risk model based on the minimum Akaike information criterion (AIC) was used to analyze the above variables and then the variables were screened out. The traditional Cox model for predicting OS was constructed based on the screened variables. The RSF algorithm was used to analyze the variables with P < 0.05 in the univariate Cox regression analysis, and 5 important features were selected. Multivariate Cox proportional risk model was selected based on the minimum AIC. Then the RSF-Cox model for predicting OS was constructed by using screened variables. The time-dependent receiver operating characteristic (tROC) curve and the area under the curve (AUC), calibration curve, decision curve and integrated Brier score (IBS) in the training set and the validation set were used to evaluate the prediction performance of the models. Results:Univariate Cox regression analysis showed that age, chemotherapy, lymph node metastasis, radiotherapy, surgical method, tumor infiltration degree, tumor number, tumor diameter and diagnosis time were factors affecting the prognosis of ATC (all P < 0.05). Multivariate Cox regression analysis based on minimal AIC (4 855.8) showed that younger age (61-70 years vs. > 80 years: HR = 0.732, 95% CI 0.56-0.957, P = 0.023; ≤ 50 years vs. 
> 80 years: HR = 0.561, 95% CI 0.362-0.87, P = 0.010), receiving chemotherapy (receiving or not: HR = 0.623, 95% CI 0.502-0.773, P < 0.001), receiving radiotherapy (receiving or not: HR = 0.695, 95% CI 0.559-0.866, P = 0.001), receiving surgery (lobectomy vs. no surgery or unknown: HR = 0.712, 95% CI 0.541-0.939, P = 0.016; total resection or subtotal resection vs. no surgery or unknown: HR = 0.535, 95% CI 0.436-0.701, P < 0.001), and tumor diameter (≤ 2 cm vs. > 6 cm: HR = 0.495, 95% CI 0.262-0.938, P = 0.031; > 2 cm and ≤ 4 cm vs. > 6 cm: HR = 0.714, 95% CI 0.520-0.980, P = 0.037; > 4 cm and ≤ 6 cm vs. > 6 cm: HR = 0.699, 95% CI 0.545-0.897, P = 0.005) were independent protective factors for OS of ATC patients. Lymph node metastasis (N 1 unknown vs. N 0: HR = 1.664, 95% CI 1.158-2.390, P = 0.006; N 1b: HR = 1.312, 95% CI 1.029-1.673, P = 0.028) and more aggressive tumor infiltration degree (group 3 vs. group 1: HR = 1.492, 95% CI 1.062-2.096, P = 0.021; group 4 vs. group 1: HR = 1.636, 95% CI 1.194-2.241, P = 0.002) were independent risk factors for OS of ATC patients. Although diagnosis time was not statistically significant (2010-2015 vs. 2004-2009: HR = 1.166, 95% CI 0.962-1.413, P = 0.118), its inclusion could improve the efficacy of the traditional Cox model. The RSF algorithm was used to select 5 important variables: surgical method, tumor diameter, age group, chemotherapy, and tumor number. Multivariate Cox regression analysis based on minimum AIC (4 884.6) showed that chemotherapy (receiving or not: HR = 0.574, 95% CI 0.476-0.693, P < 0.001), surgical method (lobectomy vs. no surgery or unknown: HR = 0.730, 95% CI 0.567-0.940, P = 0.015; total resection or subtotal resection vs. no surgery or unknown: HR = 0.527, 95% CI 0.423-0.658, P < 0.001), tumor diameter (≤ 2 cm vs. > 6 cm: HR = 0.428, 95% CI 0.231-0.793, P = 0.007; > 2 cm and ≤ 4 cm vs. > 6 cm: HR = 0.701, 95% CI 0.513-0.958, P = 0.026; > 4 cm and ≤ 6 cm vs. 
> 6 cm: HR = 0.681, 95% CI 0.536-0.866, P = 0.002) were independent factors for OS of ATC patients. RSF-Cox model was constructed based on 3 variables. The tAUC curve analysis showed that RSF-Cox model for predicting the 6-month, 12-month, and 18-month OS rates were 93.56, 92.62, and 90.80, respectively in the training set, and 93.05, 92.47, and 90.20, respectively in the validation set; in the traditional Cox model, the corresponding OS rates were 89.00, 87.76, 85.24, respectively in the training set, and 86.22, 83.68, 82.86, respectively in the validation set. When predicting OS rate at 6-month, 12-month and 18-month, the calibration curve of RSF-Cox model was closer to 45° compared with that of traditional Cox model, and the clinical net benefit of decision curve in RSF-Cox model was higher than that in traditional Cox model. The IBS of RSF-Cox model (0.089) was lower than that of traditional Cox model (0.111). Conclusions:The RSF model based on chemotherapy, surgical method and tumor diameter can effectively predict the OS of ATC patients.

14.
Chinese Journal of Health Management ; (6): 41-46, 2023.
Article in Chinese | WPRIM | ID: wpr-993643

ABSTRACT

Objective: To explore indicators related to visceral fat index by constructing a random forest model. Methods: In this cross-sectional study, the laboratory measures and body composition analysis records of 617 hospital employees (in-service and retired) who underwent physical examination in Heilongjiang Provincial Hospital Health Management Center from March to September 2021 were selected. The subjects were divided into a training set (n=411) and a test set (n=206) with a ratio of 2:1. A total of 110 predictors were included in the model. The model was constructed with the training set and was evaluated with the test set. The optimal number of nodes and decision trees were selected to evaluate the prediction performance of the optimal model, and the top 10 relatively important factors were selected for further investigation. The 617 participants were further divided into two groups according to the visceral fat index, the normal and the high visceral fat index groups, and the differences in the top 10 relatively important factors were compared between the two groups. Results: The optimal number of nodes of the final random forest model was 39 and the number of decision trees was 300. The accuracy, precision, sensitivity and specificity of the model were 83.3%, 73.9%, 89.4% and 78.7%, respectively. The area under the receiver operating characteristic curve and 95% confidence interval of the model was 0.881 (0.832-0.931). The top 10 relatively important factors in the model were body mass index, gender, age, serum uric acid, red blood cell count, monocyte cell count, C-peptide, carcinoembryonic antigen, glycosylated hemoglobin and glutamyl transpeptidase. There were significant differences in the above-mentioned 10 indicators between the subjects with normal and high visceral fat index (all P<0.05). 
Conclusions:The random forest model built in this study has good performance in predicting visceral fat index, and visceral fat is related with changes in liver function, pancreas function and immune function.

15.
Rev. invest. clín ; 74(6): 314-327, Nov.-Dec. 2022. tab, graf
Article in English | LILACS-Express | LILACS | ID: biblio-1431820

ABSTRACT

Background: The coronavirus disease (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus and has been responsible for nearly 6 million deaths worldwide in the past 2 years. Machine learning (ML) models could help physicians identify high-risk individuals. Objectives: To study the use of ML models for predicting COVID-19 outcomes using clinical data alone and a combination of clinical and metabolic data, measured in a metabolomics facility at a public university. Methods: A total of 154 patients were included in the study. The "basic profile" comprised clinical and demographic variables (33 variables), whereas the "extended profile" also included metabolomic and immunological variables (156 characteristics). Feature selection was carried out for each profile with a genetic algorithm (GA), and random forest models were trained and tested to predict each of the stages of COVID-19. Results: The model based on the extended profile was more useful in the early stages of the disease. Models based on clinical data were preferred for predicting severe and critical illness and death. ML detected trimethylamine N-oxide, lipid mediators, and neutrophil/lymphocyte ratio as important variables. Conclusion: ML and GAs provided adequate models to predict COVID-19 outcomes in patients with different severity grades.

16.
Chinese Journal of Radiological Medicine and Protection ; (12): 966-972, 2022.
Article in Chinese | WPRIM | ID: wpr-993034

ABSTRACT

Objective: To establish prediction models using the random forest (RF) and support vector machine (SVM) algorithms for numerical and classification prediction of the gamma passing rate (GPR) in volumetric modulated arc therapy (VMAT) validation. Methods: A total of 258 patients who received VMAT radiotherapy at the 1st Affiliated Hospital of Wenzhou Medical University from April 2019 to August 2020 were retrospectively selected for patient-specific QA measurements, including 38 patients treated for the head and neck and 220 patients treated for the chest and abdomen. Thirteen complexity parameters were extracted from the patients' VMAT plans, and the GPRs for VMAT validation under the analysis criteria of 3%/3 mm and 2%/2 mm were collected. The patients were randomly divided into a training cohort (70%) and a validation cohort (30%), and the complexity parameters for the numerical and classification predictions were screened using the RF and minimum redundancy maximum relevance (mRMR) methods, respectively. Complexity models and mixed models were established using PTV volume, subfield width, and smoothness factors based on the RF and SVM algorithms individually. The prediction performance of the established models was analyzed and compared. Results: For the validation cohort, the GPR numerical prediction errors of the complexity models based on RF and SVM under the two analysis criteria were as follows. The root-mean-square errors (RMSEs) under the analysis criterion of 3%/3 mm were 1.788% and 1.753%, respectively; the RMSEs under the analysis criterion of 2%/2 mm were 5.895% and 5.444%, respectively; the mean absolute errors (MAEs) under the analysis criterion of 3%/3 mm were 1.415% and 1.334%, respectively; and the MAEs under the analysis criterion of 2%/2 mm were 4.644% and 4.255%, respectively.
For the validation cohort, the GPR numerical prediction errors of the mixed models based on RF and SVM under the two analysis criteria were as follows. The RMSEs under the analysis criterion of 3%/3 mm were 1.760% and 1.815%, respectively; the RMSEs under the analysis criterion of 2%/2 mm were 5.693% and 5.590%, respectively; the MAEs under the analysis criterion of 3%/3 mm were 1.386% and 1.319%, respectively; and the MAEs under the analysis criterion of 2%/2 mm were 4.523% and 4.310%, respectively. For the validation cohort, the AUCs of the GPR classification prediction of the complexity models based on RF and SVM were 0.790 and 0.793, respectively, under the analysis criterion of 3%/3 mm and 0.763 and 0.754, respectively, under the analysis criterion of 2%/2 mm. For the validation cohort, the AUCs of the GPR classification prediction of the mixed models based on RF and SVM were 0.806 and 0.859, respectively, under the analysis criterion of 3%/3 mm and 0.796 and 0.796, respectively, under the analysis criterion of 2%/2 mm. Conclusions: Complexity models and mixed models were developed based on the RF and SVM methods. Both types of models allow numerical and classification prediction of the GPRs of VMAT radiotherapy plans under analysis criteria of 3%/3 mm and 2%/2 mm. The mixed models have higher prediction accuracy than the complexity models.
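
The two error measures reported in this abstract, RMSE and MAE, follow their standard definitions and can be sketched as below. The GPR values in the example are hypothetical placeholders, not data from the study, and the function names are chosen here for illustration.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between paired observed and predicted values."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(observed, predicted):
    """Mean absolute error between paired observed and predicted values."""
    absolute = [abs(p - o) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)

# Hypothetical gamma passing rates in percent (not taken from the abstract).
measured_gpr = [98.0, 95.0, 99.0]
predicted_gpr = [97.0, 96.0, 97.0]

example_rmse = rmse(measured_gpr, predicted_gpr)
example_mae = mae(measured_gpr, predicted_gpr)
print(example_rmse, example_mae)
```

Because RMSE squares each error before averaging, it weights large misses more heavily than MAE, which is consistent with every RMSE in the abstract exceeding the corresponding MAE.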

17.
Journal of Public Health and Preventive Medicine ; (6): 1-5, 2022.
Article in Chinese | WPRIM | ID: wpr-920363

ABSTRACT

Objective: To compare the performance of random forest and SARIMA (seasonal autoregressive integrated moving average) models in predicting the incidence rate of brucellosis. Methods: Using brucellosis cases reported in the China Disease Prevention and Control Information System from 2005 to 2017, random forest and SARIMA models were trained and used for forecasting, and their forecasts were compared. Results: The R² and RMSE (root mean squared error) were 0.904 and 0.034351 for the SARIMA model and 0.927 and 0.03345 for the random forest model, respectively. Conclusion: Both models predicted the incidence of brucellosis with high accuracy. The random forest model performed slightly better than the SARIMA model and has more practical value.
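
For reference, R² is commonly computed as one minus the ratio of the residual sum of squares to the total sum of squares. The abstract does not state which definition the authors used, so the sketch below assumes this common form. The incidence values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical monthly incidence rates (not taken from the abstract).
observed_rate = [0.10, 0.20, 0.30, 0.40]
forecast_rate = [0.12, 0.18, 0.33, 0.37]
print(r_squared(observed_rate, forecast_rate))
```

Under this definition a perfect forecast gives 1, and a forecast that always returns the observed mean gives 0.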

18.
Cancer Research and Clinic ; (6): 726-730, 2022.
Article in Chinese | WPRIM | ID: wpr-958924

ABSTRACT

Objective: To investigate the predictive value of an established random forest model for pathologic complete response (pCR) in breast cancer patients undergoing neoadjuvant chemotherapy. Methods: The clinicopathologic data of 142 primary breast cancer patients who underwent breast-conserving surgery or modified radical mastectomy after neoadjuvant chemotherapy at Cangzhou Central Hospital between January 2010 and October 2021 were retrospectively analyzed. pCR was defined histologically as the absence of residual infiltrating tumor in the breast and axillary lymph nodes. The patients were divided into a pCR group (23 cases) and a non-pCR group (119 cases) according to whether they achieved pCR, and the clinicopathologic data of the two groups were compared. The risk factors affecting pCR were identified by logistic regression analysis, a random forest model was established using the random forest function of the R statistical software, and the Gini index of the random forest algorithm was used to rank the importance of variables. The receiver operating characteristic (ROC) curve was used to assess the value of the random forest model in predicting the efficacy of neoadjuvant chemotherapy. Results: The overall pCR rate after neoadjuvant chemotherapy was 16.20% (23/142). The proportions of patients with tumor diameter ≤5 cm, negative axillary lymph nodes, negative human epidermal growth factor receptor 2 (HER2), Ki-67 positive index >20%, histological grade 2, and neoadjuvant chemotherapy regimens including targeted therapy were higher in the pCR group than in the non-pCR group, and the differences were statistically significant (all P < 0.05). Univariate logistic regression analysis showed that tumor diameter, axillary lymph node status, HER2, Ki-67, histological grade, and neoadjuvant chemotherapy regimen were associated with pCR (all P < 0.05).
Multivariate logistic regression analysis showed that tumor diameter >5 cm (OR = 5.85, 95% CI 1.28-26.67, P = 0.022), positive axillary lymph nodes (OR = 11.22, 95% CI 1.84-68.42, P = 0.009), positive HER2 (OR = 7.35, 95% CI 1.45-37.26, P = 0.016), Ki-67 positive index ≤20% (OR = 1.03, 95% CI 1.01-1.06, P = 0.017), histological grade 3 (OR = 7.37, 95% CI 1.24-43.86, P = 0.028), and non-targeted therapy (OR = 0.02, 95% CI 0.00-0.25, P = 0.003) were independent risk factors for pCR. The random forest algorithm ranked the risk factors for pCR in the following order of importance: low Ki-67 expression, positive axillary lymph nodes, tumor diameter >5 cm, positive HER2, non-targeted therapy and histological grade 3. The area under the ROC curve of the random forest model for predicting pCR was 0.84 (95% CI 0.74-0.93); the sensitivity was 87.0% and the specificity was 72.3% at the optimal cut-off value of 0.88. Conclusions: Low expression of Ki-67, positive axillary lymph nodes, tumor diameter >5 cm, positive HER2, non-targeted therapy and histological grade 3 are risk factors for pCR in breast cancer patients after neoadjuvant chemotherapy. The random forest model helps to predict pCR in breast cancer patients after neoadjuvant chemotherapy.

19.
Chinese Journal of Radiology ; (12): 1001-1008, 2022.
Article in Chinese | WPRIM | ID: wpr-956754

ABSTRACT

Objective: To explore the value of a random forest regression model for predicting pulmonary function test results. Methods: From August 2018 to December 2019, 615 subjects who underwent screening for three major chest diseases at Shanghai Changzheng Hospital were analyzed retrospectively. According to the ratio of forced expiratory volume in the first second to forced vital capacity (FEV1/FVC) and the forced expiratory volume in the first second as a percentage of the predicted value (FEV1%), the subjects were divided into a normal group, a high-risk group and a chronic obstructive pulmonary disease (COPD) group. The quantitative CT parameters of the small airways were parametric response mapping (PRM) parameters, including lung volume, the volume of functional small airways disease (PRMV fSAD), the volume of emphysema (PRMV Emph), the volume of normal lung tissue (PRMV Normal), the volume of uncategorized lung tissue (PRMV Uncategorized) and the percentage of each of the latter four volumes relative to the whole lung (%). ANOVA or the Kruskal-Wallis H test was used to test the differences in basic clinical characteristics (age, sex, height, body mass), pulmonary function parameters and small airway quantitative CT parameters among the three groups; the Spearman test was used to evaluate the correlation between PRM parameters and pulmonary function parameters. Finally, a random forest regression model based on PRM combined with the four basic clinical characteristics was constructed to predict lung function. Results: There were significant differences in the whole-lung PRM parameters among the three groups (P<0.001). The quantitative CT parameters PRMV Emph, PRMV Emph%, and PRMV Normal% showed a moderate correlation with FEV1/FVC (P<0.001). Whole lung volume, PRMV Normal, PRMV Uncategorized and PRMV Uncategorized% were strongly or moderately positively correlated with FVC (P<0.001); other PRM parameters were weakly or very weakly correlated with pulmonary function parameters.
Based on the above parameters, a random forest model for predicting FEV1/FVC and a random forest model for predicting FEV1% were established. For the FEV1/FVC model, the R² between predicted and actual values was 0.864 in the training set and 0.749 in the validation set. For the FEV1% model, the R² between predicted and actual values was 0.888 in the training set and 0.792 in the validation set. The sensitivity, specificity and accuracy of the FEV1% random forest model for distinguishing the normal group from the high-risk group were 0.85 (34/40), 0.90 (65/72) and 0.88 (99/112), respectively; the sensitivity, specificity and accuracy of the FEV1/FVC random forest model for differentiating the non-COPD group from the COPD group were 0.89 (8/9), 1.00 (112/112) and 0.99 (120/121), respectively. However, the accuracy of the two models combined for subclassification of COPD [Global Initiative for Chronic Obstructive Lung Disease (GOLD) Ⅰ, GOLD Ⅱ and GOLD Ⅲ+Ⅳ] was only 0.44. Conclusions: The small airway quantitative CT parameter PRM can distinguish the normal, high-risk and COPD populations. The comprehensive regression prediction model based on PRM parameters combined with clinical characteristics shows good performance in differentiating the normal group from the high-risk group and the non-COPD group from the COPD group. Therefore, a one-stop CT scan can evaluate the functional small airways and pulmonary function simultaneously.
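
The sensitivity, specificity and accuracy figures in this abstract can be checked directly from the counts given in parentheses. The sketch below assumes that the high-risk group and the COPD group are the positive class in each comparison, which the abstract does not state explicitly. The function name `rates` is chosen here for illustration.

```python
def rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Normal vs high-risk (FEV1% model): sensitivity 34/40, specificity 65/72.
normal_vs_high_risk = rates(tp=34, fn=6, tn=65, fp=7)

# Non-COPD vs COPD (FEV1/FVC model): sensitivity 8/9, specificity 112/112.
non_copd_vs_copd = rates(tp=8, fn=1, tn=112, fp=0)

print([round(value, 2) for value in normal_vs_high_risk])
print([round(value, 2) for value in non_copd_vs_copd])
```

The counts are internally consistent: 34 + 65 = 99 correct out of 40 + 72 = 112 subjects, and 8 + 112 = 120 correct out of 9 + 112 = 121 subjects, matching the reported accuracies.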

20.
Japanese Journal of Drug Informatics ; : 145-153, 2022.
Article in Japanese | WPRIM | ID: wpr-966102

ABSTRACT

Objective: Currently, limited information is available on the milk transfer properties of drugs taken by lactating women. Therefore, we aimed to construct a prediction model of the milk transfer of drugs using machine learning methods. Methods: We obtained data from Hale's Medications & Mothers' Milk (MMM) and SciFinder®, and then constructed the datasets. The physicochemical and pharmacokinetic data were used as feature variables, and the objective variable was milk transferability, classified into two groups: M/P ratio ≥ 1 and M/P ratio < 1. Analyses were conducted using the following machine learning methods: logistic regression, linear support vector machine (linear SVM), kernel method support vector machine (kernel SVM), random forest, and k-nearest neighbor classification. The results were compared to those obtained with the linear regression equation of Yamauchi et al. from a previous study. The analysis was performed using scikit-learn (version 0.24.2) with Python (version 3.8.10). Results: Model construction and validation were performed on training data comprising 159 drugs. The results revealed that the random forest had the highest accuracy, area under the receiver operating characteristic curve (AUC), and F value. Additionally, the results with test data A and B (n = 36, 31), which were not used for training, showed that both the F value and the accuracy of the random forest and the kernel method SVM exceeded those of the linear regression equation of Yamauchi et al. Conclusion: We were able to construct a predictive model of milk transferability with relatively high performance using a machine learning method capable of nonlinear separation. The predictive model in this study can be applied to drugs with unknown M/P ratios to provide a new source of information on milk transfer.
